Bayati, Graduate School of Business, Stanford University
Title: Online Decision Making with High-Dimensional Covariates
Abstract
The growing availability of data has enabled decision-makers to tailor choices at the individual level. This involves learning a model of decision rewards conditional on individual-specific covariates or features. Recently, contextual bandits have been introduced as a framework for studying these online decision-making problems. However, when the feature space is high-dimensional, the existing literature only considers settings where features are generated adversarially, which leads to highly conservative performance guarantees: regret bounds that scale as $\sqrt{n}$, where $n$ is the number of samples. Motivated by medical decision-making problems, where stochastic features are more realistic, we introduce a new algorithm that relies on two sequentially updated LASSO estimators. One estimator (with low bias) is used when we are confident about its accuracy; otherwise, a more biased (but potentially more accurate) estimator is used. We prove that our algorithm achieves regret of order $s^2 [\log n]^2 + s^2 [\log n][\log p]$, where $p$ is the dimension of the features and $s$ is the number of relevant features. The key step in our analysis is proving a new oracle inequality that guarantees convergence of the LASSO estimator despite the non-i.i.d. data induced by the bandit policy. We also provide a new analysis of the low-dimensional setting that improves existing bounds by a factor of $p$. We illustrate the practical relevance of the proposed algorithm by evaluating it on a warfarin dosing problem. A patient's optimal warfarin dosage depends on the patient's genetic profile and medical records; an incorrect initial dosage may result in adverse consequences such as stroke or bleeding. We show that our algorithm outperforms existing bandit methods as well as physicians, correctly dosing a majority of patients. Based on joint work with Hamsa Bastani [1].
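The two-estimator idea in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the forced-sampling schedule, the screening threshold `h`, the penalty values, or the synthetic data, so `forced_arm`, `choose_arm`, `run_bandit`, and all parameter defaults are hypothetical. Each arm keeps one LASSO fit on forced samples only (low bias, used when it separates the arms clearly) and one fit on all samples for that arm (used to break near-ties). The all-sample penalty is held constant here for simplicity.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def soft_threshold(z, t):
    """Shrink z toward zero by t (proximal map of the L1 penalty)."""
    return z - t if z > t else z + t if z < -t else 0.0

def lasso_cd(X, y, lam, n_iter=30):
    """Coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, kept up to date
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def forced_arm(t, n_arms, q):
    """Arm forced at round t (1-indexed), or None. Assumed schedule:
    blocks of n_arms*q forced rounds starting after (2^n - 1)*n_arms*q."""
    n = 0
    while (2 ** n - 1) * n_arms * q < t:
        base = (2 ** n - 1) * n_arms * q
        if t <= base + n_arms * q:
            return (t - base - 1) // q
        n += 1
    return None

def choose_arm(x, betas_forced, betas_all, h):
    """Screen arms with the forced-sample fits, then pick among the
    survivors using the all-sample fits."""
    est = [dot(x, b) for b in betas_forced]
    best = max(est)
    candidates = [k for k, v in enumerate(est) if v >= best - h / 2]
    return max(candidates, key=lambda k: dot(x, betas_all[k]))

def run_bandit(n_rounds=150, p=6, q=2, h=1.0, lam_forced=0.05, lam_all=0.05, seed=0):
    rng = random.Random(seed)
    # Hypothetical sparse ground truth: one relevant feature per arm.
    true_betas = [[1.0] + [0.0] * (p - 1), [0.0, 1.0] + [0.0] * (p - 2)]
    K = len(true_betas)
    data_forced = [([], []) for _ in range(K)]
    data_all = [([], []) for _ in range(K)]
    betas_forced = [[0.0] * p for _ in range(K)]
    betas_all = [[0.0] * p for _ in range(K)]
    regret = 0.0
    for t in range(1, n_rounds + 1):
        x = [rng.gauss(0, 1) for _ in range(p)]  # stochastic covariates
        arm = forced_arm(t, K, q)
        forced = arm is not None
        if not forced:
            arm = choose_arm(x, betas_forced, betas_all, h)
        reward = dot(x, true_betas[arm]) + rng.gauss(0, 0.1)
        regret += max(dot(x, b) for b in true_betas) - dot(x, true_betas[arm])
        # Forced rounds update both fits; other rounds update only the all-sample fit.
        for store, betas, lam, use in ((data_forced, betas_forced, lam_forced, forced),
                                       (data_all, betas_all, lam_all, True)):
            if use:
                store[arm][0].append(x)
                store[arm][1].append(reward)
                betas[arm] = lasso_cd(store[arm][0], store[arm][1], lam)
    return regret, betas_all
```

Calling `run_bandit()` returns the cumulative regret on the synthetic problem together with the all-sample coefficient estimates. Changing `h` moves the point at which the forced-sample fit is trusted on its own.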
Publication date: 2016